

Stochastic Constraints for Fast Image Correspondence Search with Uncertain Terrain Model

Michael Veth, Student Member, IEEE, John Raquet, Member, IEEE, Meir Pachter, Fellow, IEEE

Air  Force  Institute  of  Technology 


Abstract —  The  navigation  state  (position,  velocity,  and  attitude) 
can  be  determined  using  optical  measurements  from  an  imaging 
sensor pointed toward the ground. Extracting navigation information from an image sequence depends on tracking the location
of  stationary  objects  in  multiple  images,  which  is  generally 
termed  the  correspondence  problem.  This  is  an  active  area  of 
research  and  many  algorithms  exist  which  attempt  to  solve  this 
problem  by  identifying  a  unique  feature  in  one  image  and  then 
searching  subsequent  images  for  a  feature  match.  In  general, 
the  correspondence  problem  is  plagued  by  feature  ambiguity, 
temporal  feature  changes,  and  occlusions  which  are  difficult  for  a 
computer  to  address.  Constraining  the  correspondence  search  to 
a  subset  of  the  image  plane  has  the  dual  advantage  of  increasing 
robustness  by  limiting  false  matches  and  improving  search  speed. 
A  number  of  ad-hoc  methods  to  constrain  the  correspondence 
search  have  been  proposed  in  the  literature. 

In  this  paper,  a  rigorous  stochastic  projection  method  is 
developed  which  constrains  the  correspondence  search  space 
by  incorporating  a  priori  knowledge  of  the  aircraft  navigation 
state  using  inertial  measurements  and  a  statistical  terrain  model. 
The  stochastic  projection  algorithm  is  verified  using  Monte 
Carlo  simulation  and  flight  data.  The  constrained  correspondence 
search  area  is  shown  to  accurately  predict  the  pixel  location  of 
a  feature  with  an  arbitrary  level  of  confidence,  thus  promising 
improved  speed  and  robustness  of  conventional  algorithms. 


I.  Introduction 

It is well known that optical measurements provide excellent navigation information when interpreted properly.
Optical  navigation  is  not  new.  Pilotage  is  the  oldest  and  most 
natively  familiar  form  of  navigation  to  humans  and  other 
animals.  For  centuries,  navigators  have  utilized  mechanical 
instruments  such  as  astrolabes,  sextants,  and  driftmeters  [12] 
to  make  precision  observations  of  the  sky  and  ground  in  order 
to  determine  their  position,  velocity,  and  attitude. 

The difficulty in using optical measurements for autonomous navigation, that is, without human intervention, has always been in the interpretation of the image, a difficulty
shared  with  Automatic  Target  Recognition  (ATR).  Indeed, 
when  celestial  observations  are  used,  the  ATR  problem  in  this 
structured  environment  is  tractable  and  automatic  star  trackers 
are  widely  used  for  space  navigation  and  ICBM  guidance. 
When  ground  images  are  to  be  used,  the  difficulties  associated 
with  image  interpretation  are  paramount.  At  the  same  time,  the 
problems  associated  with  the  use  of  optical  measurements  for 
navigation are somewhat easier than ATR. Moreover, recent developments in feature tracking algorithms, the miniaturization and falling cost of inertial sensors and optical imagers, and the continuing improvement in microprocessor technology motivate us to consider using inertial measurements to aid the task of feature tracking in image sequences.

Optical navigation methods are typically classified as either feature-based or optic flow-based, depending on how the image correspondence problem is addressed. Feature-based methods determine
correspondence  for  “landmarks”  in  the  scene  over  multiple 
frames,  while  optic  flow-based  methods  typically  determine 
correspondence  for  a  whole  portion  of  the  image  between 
frames.  A  good  reference  on  image  correspondence  is  [7]. 
Optic flow methods have generally been proposed in the literature for elementary motion detection, focusing on determining relative velocity or angular rates, or for obstacle avoidance [4].

Feature tracking-based navigation methods have been proposed both for fixed-mount imaging sensors and for gimbal-mounted detectors which “stare” at the target of interest, similar to the gimballed infrared seeker on heat-seeking, air-to-air missiles. Many feature tracking-based navigation methods exploit knowledge of the target location (either a priori, through binocular stereopsis, or by exploiting terrain homography) and solve the inverse trajectory projection problem [1], [10]. If no a priori knowledge of the scene is provided, egomotion estimation is completely correlated with estimating the scene. This is referred to as the structure from motion (SFM) problem.
A theoretical development of the geometry of fixed-target tracking with no a priori knowledge is provided in [11]. An online (Extended Kalman Filter-based) method for calculating a trajectory by tracking features at an unknown location on the Earth’s surface, provided the topography is known, is given in [3]. Finally, navigation-grade inertial sensors and
terrain  images  collected  on  a  T-38  “Talon”  are  processed 
and  the  potential  benefits  of  optical-aided  inertial  sensors  are 
experimentally  demonstrated  in  [14]. 

Many methods for solving the correspondence problem have been proposed in the computer vision literature. A popular
algorithm  is  the  Lucas-Kanade  feature  tracker  [6],  which 
relies  on  the  premise  of  the  invariance  of  the  intensity  field 
between  images.  It  uses  a  template  correlation  algorithm  to 
minimize  the  sum  of  squared  differences  (SSD)  between  image 
intensities. The algorithm typically assumes a linear (x-y plane) motion model, but can be extended to optimize over affine or bilinear transformations. Other feature correspondence algorithms have been proposed which are invariant to rotations, scaling, or both (e.g., [5]). More robust feature
tracking  algorithms  are  typically  computationally  expensive 
and  a  designer  must  trade  tracking  robustness  and  accuracy 
for  real-time  performance. 
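The SSD criterion and the benefit of a restricted search window can be illustrated with a minimal sketch. This is an exhaustive template search, not the iterative Lucas-Kanade tracker itself; the function name `ssd_search`, the window parameter `half_width`, and the synthetic image are assumptions made for illustration only.

```python
import numpy as np

def ssd_search(image, template, center, half_width):
    """Exhaustive SSD template search inside a square window.

    `center` is the predicted (row, col) of the template's top-left corner and
    `half_width` bounds the search in pixels. Returns the best top-left corner
    and its sum of squared differences.
    """
    th, tw = template.shape
    r0, c0 = center
    r_min, r_max = max(0, r0 - half_width), min(image.shape[0] - th, r0 + half_width)
    c_min, c_max = max(0, c0 - half_width), min(image.shape[1] - tw, c0 + half_width)
    best_ssd, best_rc = np.inf, None
    for r in range(r_min, r_max + 1):
        for c in range(c_min, c_max + 1):
            patch = image[r:r + th, c:c + tw]
            ssd = float(np.sum((patch - template) ** 2))
            if ssd < best_ssd:
                best_ssd, best_rc = ssd, (r, c)
    return best_rc, best_ssd

# Synthetic example: the template is cut from the image at (30, 22) and the
# search starts from a slightly wrong prediction at (28, 20).
rng = np.random.default_rng(0)
image = rng.random((64, 64))
template = image[30:38, 22:30].copy()
match, score = ssd_search(image, template, center=(28, 20), half_width=6)
```

The cost of the search grows with the square of `half_width`, which is why a tighter, well-founded search region pays off directly in run time.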

This  paper  proposes  an  approach  to  optimize  the  feature 
tracking  problem  by  exploiting  navigation  information,  derived 
from  six  degree-of-freedom  inertial  measurements,  and  prior 
terrain  information,  to  constrain  the  correspondence  search 
space  and  aid  the  attendant  optimization  algorithm.  The  theory 
is  developed  for  a  kinematic  motion  model  with  inertial 
sensors. 

The  paper  is  organized  as  follows.  Section  II  explores 
current  approaches  for  constraining  correspondence  searches 
and  discusses  the  strengths  and  weaknesses  of  such.  Section  III 
poses  the  statistical  projection  problem  in  the  most  general 
terms.  Reasonable  assumptions  are  proposed  which  make  the 
general  problem  tractable  for  use  with  an  Extended  Kalman 
Filter  algorithm.  In  Section  IV,  the  mathematical  model  used  to 
describe  the  navigation  state  and  navigation  state  uncertainty 
is presented. This includes the definition of reference frames, navigation dynamics, the perturbation model, and the initial conditions. Section V builds upon the mathematical
model  to  derive  the  stochastic  projection  method.  The  resulting 
equations  allow  the  user  to  predict  the  pixel  location  and 
uncertainty  of  a  feature  between  two  images.  The  stochastic 
projection  method  is  validated  in  Section  VI  using  Monte 
Carlo  simulation  and  flight  data.  Finally,  conclusions  are 
drawn  regarding  the  performance  of  the  method  in  Section  VII. 

II. Current Correspondence Constraint Approaches

Exploiting inertial measurements to constrain the correspondence search has been proposed in the literature. In this
section,  two  methods  which  exploit  inertial  measurements  are 
discussed. 

Bhanu  and  Roberts  [2]  utilize  inertial  measurements  to 
compensate  for  rotation  between  images  and  to  predict  the 
focus  of  expansion  in  the  second  image.  Once  the  second 
image  is  derotated  and  the  focus  of  expansion  is  established, 
the  correspondence  between  points  of  interest  is  calculated 
using  goodness-of-fit  metrics.  One  relevant  metric  is  the 
correspondence  search  constraint  placed  on  each  point.  This 
constraint  ensures  each  interest  point  lies  in  a  cone-shaped 
region,  with  apex  at  the  focus  of  expansion,  bisected  by  the 
line  joining  the  focus  of  expansion  and  the  interest  point  in 
the  camera  frame  at  the  first  image  time.  While  this  constraint 
is  not  statistically  rigorous,  it  does  show  the  value  of  using 
inertial  measurements  to  aid  the  correspondence  problem. 


Fig.  1.  Correspondence  search  constraint  using  epipolar  lines.  Given  a 
projection  of  an  arbitrary  point  in  an  initial  image,  combined  with  knowledge 
of  the  translation  and  rotation  to  a  second  image,  the  correspondence  search 
can  be  constrained  to  an  area  near  the  epipolar  line.  Note  the  epipole  can  be 
located  outside  of  the  image  plane,  as  shown  in  this  example. 

Strelow  also  incorporates  inertial  measurements  to  constrain 
the  correspondence  search  between  image  frames  [15].  This 
constraint on the image search space is a similar concept to the focus of expansion method proposed by Bhanu; however, Strelow generalizes the approach by exploiting epipolar geometry.

The  projection  of  an  arbitrary  point  in  an  image  is  described 
by  an  epipolar  line  in  a  second  image.  All  epipolar  lines 
in an image converge at the projection of the focus of the complementary image. Combining knowledge of the translation and rotation between images and the pixel location of a
candidate  target  in  the  first  image,  a  correspondence  search  can 
then  be  constrained  to  an  area  “near”  the  epipolar  line.  This 
approach  is  illustrated  in  Fig.  1.  Strelow’s  method  of  using 
inertial  measurements  to  constrain  the  correspondence  search 
along  an  epipolar  line  is  ad-hoc,  since  the  search  space  is  not 
defined  statistically. 
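The epipolar constraint described above can be sketched with the standard fundamental-matrix relation. The pinhole intrinsic matrix `K`, the relative motion convention `X2 = R @ X1 + t`, and all numerical values are assumptions chosen for illustration; they are not taken from Strelow's implementation.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix such that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipolar_line(K, R, t, pixel1):
    """Epipolar line (a, b, c) in image 2 for a pixel in image 1.

    The line is scaled so that a*u + b*v + c is a signed distance in pixels.
    """
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ skew(t) @ R @ K_inv          # fundamental matrix
    line = F @ np.array([pixel1[0], pixel1[1], 1.0])
    return line / np.linalg.norm(line[:2])

def project(K, X):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

# Hypothetical camera and motion between the two images.
K = np.array([[500.0, 0.0, 128.0], [0.0, 500.0, 128.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.2, 0.1])

X1 = np.array([0.5, -0.3, 10.0])               # point in the first camera frame
X2 = R @ X1 + t                                # same point in the second camera frame
line = epipolar_line(K, R, t, project(K, X1))
p2 = project(K, X2)
distance = abs(line @ np.array([p2[0], p2[1], 1.0]))   # pixel distance to the line
```

With perfect knowledge of `R` and `t` the distance is zero to within rounding. How far from the line to search when `R` and `t` are uncertain is exactly what the ad-hoc “near” leaves unspecified.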

In the next section, the correspondence problem is described
using  a  stochastic  model.  This  model  is  then  used  to  determine 
a  statistically-rigorous  correspondence  search  area. 

III.  General  Problem  Formulation 

The  general  problem  is  described  as  follows.  Given  a 
pixel location of a specified landmark at time $t_i$, predict the probability density function of the pixel location of the same landmark at time $t_{i+1}$. Prior information regarding the vehicle
navigation  state,  terrain  statistics,  and  the  dynamics  of  the 
vehicle  and  landmark  are  exploited. 

Mathematically, the pixel location, $\mathbf{z}(t_i)$, corresponding to a landmark at location $\mathbf{y}(t_i)$ in the scene, is governed by the nonlinear projection function

$$\mathbf{z}(t_i) = \mathbf{h}[\mathbf{x}(t_i), \mathbf{y}(t_i), t_i] \tag{1}$$

where $\mathbf{x}(t_i)$ represents the navigation state at the time of the measurement.
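A minimal sketch of one possible projection function of this form follows. It assumes an ideal pinhole camera, a hypothetical 3x3 projection matrix `Pi`, a body-to-ECEF direction cosine matrix `C_b_e`, and a camera-to-body direction cosine matrix `C_c_b`. The paper does not specify the contents of its projection matrix, so the numbers below are illustrative only.

```python
import numpy as np

def project_landmark(p_e, C_b_e, y_e, C_c_b, Pi):
    """Project an ECEF landmark position into pixel coordinates.

    p_e   : vehicle (camera) position in the e frame
    C_b_e : body-to-ECEF direction cosine matrix
    y_e   : landmark position in the e frame
    C_c_b : camera-to-body direction cosine matrix
    Pi    : hypothetical 3x3 homogeneous camera projection matrix
    """
    s_e = y_e - p_e                       # line of sight in the e frame
    s_c = C_c_b.T @ C_b_e.T @ s_e         # rotate e -> b -> c
    if s_c[2] <= 0.0:
        raise ValueError("landmark is behind the camera")
    homog = Pi @ (s_c / s_c[2])           # normalize by depth along the optical axis
    return homog[:2] / homog[2]

# Illustrative values: camera aligned with the e frame, landmark 1000 m ahead.
Pi = np.array([[1000.0, 0.0, 128.0], [0.0, 1000.0, 128.0], [0.0, 0.0, 1.0]])
pixel = project_landmark(np.zeros(3), np.eye(3), np.array([10.0, -20.0, 1000.0]),
                         np.eye(3), Pi)
```

The division by depth is what makes the function nonlinear and non-invertible: every point along the line of sight maps to the same pixel.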

The vehicle and landmark dynamics are modeled by the following nonlinear Itô stochastic differential equations in white noise notation,

$$\dot{\mathbf{x}} = \mathbf{f}[\mathbf{x}(t), \mathbf{u}(t), t] + \mathbf{G}_x[\mathbf{x}(t), t]\,\mathbf{w}_x(t) \tag{2}$$

$$\dot{\mathbf{y}} = \mathbf{r}[\mathbf{y}(t), t] + \mathbf{G}_y[\mathbf{y}(t), t]\,\mathbf{w}_y(t) \tag{3}$$

where $\mathbf{u}(t)$ is a known input function, and $\mathbf{w}_x(t)$ and $\mathbf{w}_y(t)$ are white noise processes.

A theoretical formulation exists for this general problem; however, two issues make the solution intractable. First, the measurement observation function is ill-posed (i.e., a unique inverse does not exist). Second, propagating the conditional probability density function in time requires solving the forward Kolmogorov (i.e., Fokker-Planck) second-order partial differential equation for an infinite number of moments [9].

To  make  the  problem  tractable,  the  following  reasonable 
assumptions  are  made: 

•  The  prior  knowledge  of  the  navigation  and  target  state 
can  be  adequately  described  as  a  multivariate  Gaussian 
distribution. 

•  Additive  measurement  noise  is  zero-mean,  Gaussian,  and 
white. 

•  Stochastic  process  noise  is  zero-mean,  Gaussian,  and 
white. 

•  The  nonlinear  state  dynamics  and  measurement  equations 
can  be  adequately  modeled  using  perturbation  techniques. 

Although not required for tractability, additional assumptions are made to simplify the development and clarify the underlying concepts. First, the landmark is assumed to be stationary with respect to the surface of the Earth (i.e., $\mathbf{r}[\mathbf{y}(t), t] = \mathbf{0}$). Second, the camera is rigidly mounted to the vehicle
with  known  alignment  and  calibration.  Third,  the  terrain  is 
described  by  a  statistical  elevation  model. 

In  the  next  section,  the  relevant  reference  frames  and  the 
vehicle  dynamics  are  defined. 

IV.  Mathematical  Model 

A.  Reference  Frames 

In  this  paper,  three  reference  frames  are  used.  Variables 
expressed  in  a  specific  reference  frame  are  indicated  using 
superscript notation. The Earth-Centered Earth-Fixed (ECEF, or e) frame is a Cartesian system with the origin at the Earth’s center, the $\hat{x}^e$ axis pointing toward the intersection of the equator and the prime (Greenwich) meridian, the $\hat{z}^e$ axis extending through the North pole, and the $\hat{y}^e$ axis forming the orthogonal complement (in this paper, a caret symbol, ^, denotes a unit vector). The navigation state is expressed in the e frame.

The  vehicle  body  frame  (or  b  frame)  is  a  Cartesian  system 
with origin at the vehicle center of gravity, the $\hat{x}^b$ axis extending through the vehicle’s nose, the $\hat{y}^b$ axis extending through the vehicle’s right side, and the $\hat{z}^b$ axis pointing orthogonally out the bottom of the vehicle. The inertial measurements are
expressed  in  the  b  frame. 

The  camera  frame  (or  c  frame)  is  a  Cartesian  system  with 
origin at the center of the camera image plane, the $\hat{x}^c$ axis parallel to the camera image plane and defined as “camera up”, the $\hat{y}^c$ axis parallel to the camera image plane and defined as “camera right”, and the $\hat{z}^c$ axis pointing out of the camera aperture, orthogonal to the image plane.


Fig.  2.  Camera  frame  illustration.  The  camera  reference  frame 
originates  at  the  center  of  the  focal  plane. 


B.  Vehicle  State  and  Dynamics 

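The vehicle state of interest consists of position ($\mathbf{p}^e$), velocity ($\mathbf{v}^e$), and the direction cosine matrix from the body to the ECEF frame ($\mathbf{C}_b^e$). From [16], the vehicle state kinematics are

$$\dot{\mathbf{p}}^e = \mathbf{v}^e \tag{4}$$

$$\dot{\mathbf{v}}^e = \mathbf{C}_b^e \mathbf{f}^b - 2\boldsymbol{\Omega}_{ie}^e \mathbf{v}^e + \mathbf{g}^e \tag{5}$$

$$\dot{\mathbf{C}}_b^e = \mathbf{C}_b^e \boldsymbol{\Omega}_{ib}^b - \boldsymbol{\Omega}_{ie}^e \mathbf{C}_b^e \tag{6}$$

where $\mathbf{f}^b$ is the specific force vector measured by the accelerometers, $\boldsymbol{\Omega}_{ie}^e$ is the Earth’s sidereal angular rate vector in skew-symmetric form, $\mathbf{g}^e$ is the gravitational acceleration vector, and $\boldsymbol{\Omega}_{ib}^b$ is the angular rate of the vehicle relative to the inertial frame in skew-symmetric form, as measured by the gyroscopes.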

C.  Perturbation  Model 

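The navigation errors are defined as differences from a nominal trajectory and are represented as a position error ($\delta\mathbf{p}^e$), a velocity error ($\delta\mathbf{v}^e$), and an attitude error ($\boldsymbol{\epsilon}$) vector, defined as:

$$\tilde{\mathbf{p}}^e(t) = \mathbf{p}^e(t) + \delta\mathbf{p}^e(t) \tag{7}$$

$$\tilde{\mathbf{v}}^e(t) = \mathbf{v}^e(t) + \delta\mathbf{v}^e(t) \tag{8}$$

$$\tilde{\mathbf{C}}_b^e(t) = [\mathbf{I}_3 - (\boldsymbol{\epsilon}(t)\times)]\,\mathbf{C}_b^e(t) \tag{9}$$

where the tilde represents a nominal parameter. The error state is modeled as a zero-mean Gaussian random vector

$$\delta\mathbf{x}(t) = \begin{bmatrix} \delta\mathbf{p}^e(t) \\ \delta\mathbf{v}^e(t) \\ \boldsymbol{\epsilon}(t) \end{bmatrix} \tag{10}$$

with covariance defined as

$$E[\delta\mathbf{x}(t)\,\delta\mathbf{x}^T(t)] = \mathbf{P}_{xx}(t) \tag{11}$$

where $E[\cdot]$ is the expectation operator.

Using perturbation techniques, the dynamics of the navigation error states are modeled as a linear stochastic differential equation [8]

$$\delta\dot{\mathbf{x}}(t) = \mathbf{F}(t)\,\delta\mathbf{x}(t) + \mathbf{G}_x[\tilde{\mathbf{x}}^e(t), t]\,\mathbf{w}_x(t) \tag{12}$$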

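where $\mathbf{w}_x(t)$ is a zero-mean, white Gaussian noise process with covariance kernel

$$E[\mathbf{w}_x(t)\,\mathbf{w}_x^T(t + \tau)] = \mathbf{Q}_x(t)\,\delta(\tau). \tag{13}$$

D. Initial Conditions

At the time of the first image, $t_i$, the navigation error state is a zero-mean Gaussian random variable with covariance $\mathbf{P}_{xx}(t_i)$. The terrain elevation, $h$, is a random variable with mean $\tilde{h}$ and variance $\sigma_h^2$. The terrain elevation errors are assumed to be independent of the navigation errors.

V. Stochastic Projection Theory

The theory is divided into three sections: estimating the initial landmark position and covariance based on the pixel location of the feature selected in the first image, using inertial measurements to propagate this augmented state to the time of the second image, and projecting this landmark position state onto the second image as a probability density function in pixel coordinates. In simpler terms, these equations allow us to “predict” where a stationary feature should appear in subsequent images, thus providing a statistical measure to constrain our search space within the image.

A. Landmark Error Statistics

The landmark position corresponding to a pixel location is a nonlinear function of the navigation state, pixel location, $\mathbf{z}(t_i)$, terrain elevation, $h$, camera to body direction cosine matrix, $\mathbf{C}_c^b$, and homogeneous camera projection matrix, $\boldsymbol{\Pi}$ (see [7] for a description):

$$\mathbf{y}^e = \mathbf{g}[\mathbf{p}^e(t_i), \mathbf{C}_b^e(t_i), \mathbf{z}(t_i), h, \mathbf{C}_c^b, \boldsymbol{\Pi}] \tag{14}$$

The pixel location measurement at time $t_i$ is a nonlinear function of the navigation state, landmark position, and camera parameters:

$$\tilde{\mathbf{z}}(t_i) = \mathbf{h}[\mathbf{p}^e(t_i), \mathbf{C}_b^e(t_i), \mathbf{y}^e(t_i), \mathbf{C}_c^b, \boldsymbol{\Pi}] + \mathbf{v}(t_i) \tag{15}$$

where $\mathbf{v}(t_i)$ is a zero-mean, additive white Gaussian noise process with:

$$E[\mathbf{v}(t_i)\,\mathbf{v}^T(t_j)] = \begin{cases} \mathbf{R}, & t_i = t_j \\ \mathbf{0}, & t_i \neq t_j \end{cases} \tag{16}$$

Similarly to the navigation state, the calculated landmark position, $\tilde{\mathbf{y}}^e$, is also modeled as a perturbation about the true position:

$$\tilde{\mathbf{y}}^e = \mathbf{y}^e + \delta\mathbf{y}^e \tag{17}$$

and is a function of the calculated trajectory

$$\tilde{\mathbf{y}}^e = \mathbf{g}[\tilde{\mathbf{p}}^e(t_i), \tilde{\mathbf{C}}_b^e(t_i), \tilde{\mathbf{z}}(t_i), \tilde{h}, \mathbf{C}_c^b, \boldsymbol{\Pi}] \tag{18}$$

Applying perturbation techniques to the landmark position function, the landmark error, $\delta\mathbf{y}^e$, can be expressed as a linear function of the errors of the navigation state, terrain model, and pixel measurement model

$$\delta\mathbf{y}^e = \mathbf{G}_{yx}\,\delta\mathbf{x} + \mathbf{G}_{yh}\,\delta h + \mathbf{G}_{yz}\,\mathbf{v}(t_i) \tag{19}$$

where the influence coefficients and the terrain elevation error are

$$\mathbf{G}_{yx} = \left.\frac{\partial \mathbf{g}}{\partial \mathbf{x}}\right|_{\tilde{\mathbf{x}}, \tilde{h}, \tilde{\mathbf{z}}(t_i), \mathbf{C}_c^b, \boldsymbol{\Pi}} \tag{20}$$

$$\mathbf{G}_{yh} = \left.\frac{\partial \mathbf{g}}{\partial h}\right|_{\tilde{\mathbf{x}}, \tilde{h}, \tilde{\mathbf{z}}(t_i), \mathbf{C}_c^b, \boldsymbol{\Pi}} \tag{21}$$

$$\mathbf{G}_{yz} = \left.\frac{\partial \mathbf{g}}{\partial \mathbf{z}}\right|_{\tilde{\mathbf{x}}, \tilde{h}, \tilde{\mathbf{z}}(t_i), \mathbf{C}_c^b, \boldsymbol{\Pi}} \tag{22}$$

$$\delta h = \tilde{h} - h \tag{23}$$

Using the linearized position measurement, the landmark error is a zero-mean, Gaussian random vector. The landmark error covariance, $\mathbf{P}_{yy}(t_i)$, and cross-correlation matrices, $\mathbf{P}_{yx}(t_i)$, are defined as

$$\mathbf{P}_{yy}(t_i) = E[\delta\mathbf{y}\,\delta\mathbf{y}^T] \tag{24}$$

$$\mathbf{P}_{yx}(t_i) = E[\delta\mathbf{y}\,\delta\mathbf{x}^T] \tag{25}$$

Substituting (19) into (24), and noting the independence between navigation state, terrain, and pixel measurement errors yields:

$$\mathbf{P}_{yy}(t_i) = \mathbf{G}_{yx} E[\delta\mathbf{x}\,\delta\mathbf{x}^T] \mathbf{G}_{yx}^T + \mathbf{G}_{yh} E[\delta h^2] \mathbf{G}_{yh}^T + \mathbf{G}_{yz} E[\mathbf{v}(t_i)\,\mathbf{v}^T(t_i)] \mathbf{G}_{yz}^T \tag{26}$$

Substituting the previously defined covariance matrices for the navigation errors, terrain, and pixel measurement yields the final form of the landmark position error covariance.

$$\mathbf{P}_{yy}(t_i) = \mathbf{G}_{yx} \mathbf{P}_{xx}(t_i) \mathbf{G}_{yx}^T + \mathbf{G}_{yh} \sigma_h^2 \mathbf{G}_{yh}^T + \mathbf{G}_{yz} \mathbf{R}\,\mathbf{G}_{yz}^T \tag{27}$$

The cross-correlation matrices are calculated in a similar manner and are expressed as:

$$\mathbf{P}_{xy}(t_i) = \mathbf{P}_{xx}(t_i)\,\mathbf{G}_{yx}^T \tag{28}$$

$$\mathbf{P}_{yx}(t_i) = \mathbf{G}_{yx}\,\mathbf{P}_{xx}(t_i) \tag{29}$$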
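The landmark error covariance and cross-correlation at the first image time are plain matrix products once the influence coefficients are available. The sketch below assumes a nine-element error state, a 3x1 terrain influence vector, and a 3x2 pixel influence matrix. The random Jacobians and the noise values are placeholders, not values from the paper.

```python
import numpy as np

def landmark_error_statistics(P_xx, sigma_h, R, G_yx, G_yh, G_yz):
    """Initial landmark error covariance and navigation/landmark cross-correlation.

    P_xx    : 9x9 navigation error covariance at the first image time
    sigma_h : terrain elevation standard deviation
    R       : 2x2 pixel measurement noise covariance
    G_yx, G_yh, G_yz : 3x9, 3x1, and 3x2 influence coefficients
    """
    P_yy = (G_yx @ P_xx @ G_yx.T
            + sigma_h ** 2 * (G_yh @ G_yh.T)
            + G_yz @ R @ G_yz.T)
    P_xy = P_xx @ G_yx.T
    return P_yy, P_xy

# Placeholder inputs for illustration.
rng = np.random.default_rng(0)
P_xx = np.diag([25.0] * 3 + [0.01] * 3 + [1e-6] * 3)
sigma_h = 25.0
R = np.eye(2)
G_yx = rng.normal(size=(3, 9))
G_yh = rng.normal(size=(3, 1))
G_yz = rng.normal(size=(3, 2))
P_yy, P_xy = landmark_error_statistics(P_xx, sigma_h, R, G_yx, G_yh, G_yz)
```

Because the three error sources are independent, the same result follows from stacking the influence coefficients side by side and applying them to a block-diagonal covariance, which is a convenient cross-check in an implementation.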


B.  State  Propagation 

In  this  section,  the  nominal  navigation  state,  navigation  error 
state, and landmark error states are propagated from time $t_i$ to $t_{i+1}$.

The  nominal  aircraft  navigation  state  is  propagated  forward 
based  on  the  non-linear  dynamics  model  given  in  Equations 
(4-6),  typically  using  a  non-linear  differential  equation  solver 
(e.g.,  Runge-Kutta)  [13]. 

The landmark error dynamics are defined as a random walk:

$$\delta\dot{\mathbf{y}} = \mathbf{G}_y\,\mathbf{w}_y(t) \tag{30}$$

where $\mathbf{w}_y(t)$ is a zero-mean, white Gaussian noise process with covariance kernel

$$E[\mathbf{w}_y(t)\,\mathbf{w}_y^T(t + \tau)] = \mathbf{Q}_y\,\delta(\tau) \tag{31}$$

The navigation error stochastic differential equation is defined in Equation (12) as

$$\delta\dot{\mathbf{x}}(t) = \mathbf{F}(t)\,\delta\mathbf{x}(t) + \mathbf{G}_x[\tilde{\mathbf{x}}^e(t), t]\,\mathbf{w}_x(t) \tag{32}$$

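The navigation and landmark error covariance propagation dynamics are derived using the linearized dynamics models (12), (13), (30), (31) [9]:

$$\dot{\mathbf{P}}_{xx}(t) = \mathbf{F}(t)\mathbf{P}_{xx}(t) + \mathbf{P}_{xx}(t)\mathbf{F}^T(t) + \mathbf{G}_x(t)\mathbf{Q}_x(t)\mathbf{G}_x^T(t) \tag{33}$$

$$\dot{\mathbf{P}}_{xy}(t) = \mathbf{F}(t)\mathbf{P}_{xy}(t) \tag{34}$$

$$\dot{\mathbf{P}}_{yy}(t) = \mathbf{G}_y\mathbf{Q}_y\mathbf{G}_y^T \tag{35}$$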

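An equivalent expression for the time propagation is represented by the state transition matrix, $\boldsymbol{\Phi}(t_{i+1}, t_i)$, which projects the navigation and landmark error covariance from time $t_i$ to $t_{i+1}$ [8]. The resulting expression for the navigation and landmark error covariance is

$$\mathbf{P}_{xx}(t_{i+1}) = \boldsymbol{\Phi}(t_{i+1}, t_i)\mathbf{P}_{xx}(t_i)\boldsymbol{\Phi}^T(t_{i+1}, t_i) + \int_{t_i}^{t_{i+1}} \boldsymbol{\Phi}(t_{i+1}, \tau)\mathbf{G}_x\mathbf{Q}_x\mathbf{G}_x^T\boldsymbol{\Phi}^T(t_{i+1}, \tau)\,d\tau \tag{36}$$

$$\mathbf{P}_{xy}(t_{i+1}) = \boldsymbol{\Phi}(t_{i+1}, t_i)\mathbf{P}_{xy}(t_i) \tag{37}$$

$$\mathbf{P}_{yy}(t_{i+1}) = \mathbf{P}_{yy}(t_i) + (t_{i+1} - t_i)\,\mathbf{G}_y\mathbf{Q}_y\mathbf{G}_y^T \tag{38}$$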
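As a rough numerical illustration of the covariance propagation, the differential forms (33)-(35) can be stepped forward with simple Euler integration. The sketch assumes a constant dynamics matrix `F` and pre-multiplied noise terms `GQG_x` and `GQG_y`; the two-state position/velocity model is a toy stand-in for the nine-state navigation error model, and a real implementation would more likely use the state transition matrix directly.

```python
import numpy as np

def propagate_covariances(P_xx, P_xy, P_yy, F, GQG_x, GQG_y, dt, steps=1000):
    """Euler integration of the covariance differential equations over dt.

    F is assumed constant over the interval; GQG_x and GQG_y are the
    products G Q G^T for the navigation and landmark process noise.
    """
    h = dt / steps
    for _ in range(steps):
        P_xx = P_xx + h * (F @ P_xx + P_xx @ F.T + GQG_x)
        P_xy = P_xy + h * (F @ P_xy)
        P_yy = P_yy + h * GQG_y
    return P_xx, P_xy, P_yy

# Toy example: position/velocity error states, a scalar landmark error,
# and a 10 second interval between images.
F = np.array([[0.0, 1.0], [0.0, 0.0]])
P_xx0 = np.diag([1.0, 0.01])
P_xy0 = np.array([[0.5], [0.1]])
P_yy0 = np.array([[4.0]])
P_xx1, P_xy1, P_yy1 = propagate_covariances(
    P_xx0, P_xy0, P_yy0, F, GQG_x=np.zeros((2, 2)), GQG_y=np.array([[0.2]]), dt=10.0)
```

For this toy model the state transition matrix is `[[1, dt], [0, 1]]`, so the position variance should approach 1 + 100 x 0.01 = 2 and the cross-correlation should equal the transition matrix applied to `P_xy0`, which gives a direct check of the integration against the closed-form expressions.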

C.  Projection  of  Uncertainty  Statistics  onto  Image 

The pixel projection function is used to project the navigation state and landmark location into the image plane at time $t_{i+1}$. The pixel projection is

$$\mathbf{z}(t_{i+1}) = \mathbf{h}[\mathbf{p}^e(t_{i+1}), \mathbf{C}_b^e(t_{i+1}), \mathbf{y}^e(t_{i+1}), \mathbf{C}_c^b, \boldsymbol{\Pi}] \tag{39}$$

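The estimated pixel location error, $\delta\mathbf{z}(t_{i+1})$, is modeled as a perturbation about the nominal pixel location

$$\delta\mathbf{z}(t_{i+1}) = \tilde{\mathbf{z}}(t_{i+1}) - \mathbf{z}(t_{i+1}) \tag{40}$$

where the nominal pixel location, $\tilde{\mathbf{z}}(t_{i+1})$, is calculated using the nominal navigation state and landmark position

$$\tilde{\mathbf{z}}(t_{i+1}) = \mathbf{h}[\tilde{\mathbf{p}}^e(t_{i+1}), \tilde{\mathbf{C}}_b^e(t_{i+1}), \tilde{\mathbf{y}}^e(t_{i+1}), \mathbf{C}_c^b, \boldsymbol{\Pi}] \tag{41}$$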

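Perturbing the pixel projection function, the pixel location error can be expressed as a linear function of the errors of the navigation state and landmark position:

$$\delta\mathbf{z}(t_{i+1}) = \mathbf{H}_{zx}\,\delta\mathbf{x}(t_{i+1}) + \mathbf{H}_{zy}\,\delta\mathbf{y}(t_{i+1}) \tag{42}$$

where

$$\mathbf{H}_{zx} = \left.\frac{\partial \mathbf{h}}{\partial \mathbf{x}}\right|_{\tilde{\mathbf{x}}, \tilde{\mathbf{y}}, \mathbf{C}_c^b, \boldsymbol{\Pi}} \tag{43}$$

$$\mathbf{H}_{zy} = \left.\frac{\partial \mathbf{h}}{\partial \mathbf{y}}\right|_{\tilde{\mathbf{x}}, \tilde{\mathbf{y}}, \mathbf{C}_c^b, \boldsymbol{\Pi}} \tag{44}$$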

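The pixel error covariance, $\mathbf{P}_{zz}(t_{i+1})$, is defined as

$$\mathbf{P}_{zz}(t_{i+1}) = E[\delta\mathbf{z}\,\delta\mathbf{z}^T] \tag{45}$$

Substituting (42) into (45), and eliminating independent error sources yields the pixel location covariance:

$$\mathbf{P}_{zz}(t_{i+1}) = \mathbf{H}_{zx}\mathbf{P}_{xx}(t_{i+1})\mathbf{H}_{zx}^T + \mathbf{H}_{zx}\mathbf{P}_{xy}(t_{i+1})\mathbf{H}_{zy}^T + \mathbf{H}_{zy}\mathbf{P}_{xy}^T(t_{i+1})\mathbf{H}_{zx}^T + \mathbf{H}_{zy}\mathbf{P}_{yy}(t_{i+1})\mathbf{H}_{zy}^T \tag{46}$$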

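Finally, the covariance of the pixel location errors can be summarized by combining the equations presented in the previous sections:

$$\begin{aligned}
\mathbf{P}_{zz}(t_{i+1}) ={}& \mathbf{H}_{zx}\boldsymbol{\Phi}(t_{i+1}, t_i)\mathbf{P}_{xx}(t_i)\boldsymbol{\Phi}^T(t_{i+1}, t_i)\mathbf{H}_{zx}^T \\
&+ \mathbf{H}_{zx}\int_{t_i}^{t_{i+1}} \boldsymbol{\Phi}(t_{i+1}, \tau)\mathbf{G}_x\mathbf{Q}_x\mathbf{G}_x^T\boldsymbol{\Phi}^T(t_{i+1}, \tau)\,d\tau\,\mathbf{H}_{zx}^T \\
&+ \mathbf{H}_{zx}\boldsymbol{\Phi}(t_{i+1}, t_i)\mathbf{P}_{xx}(t_i)\mathbf{G}_{yx}^T\mathbf{H}_{zy}^T \\
&+ \mathbf{H}_{zy}\mathbf{G}_{yx}\mathbf{P}_{xx}(t_i)\boldsymbol{\Phi}^T(t_{i+1}, t_i)\mathbf{H}_{zx}^T \\
&+ \mathbf{H}_{zy}\mathbf{G}_{yx}\mathbf{P}_{xx}(t_i)\mathbf{G}_{yx}^T\mathbf{H}_{zy}^T \\
&+ \mathbf{H}_{zy}\mathbf{G}_{yh}\sigma_h^2\mathbf{G}_{yh}^T\mathbf{H}_{zy}^T \\
&+ \mathbf{H}_{zy}\mathbf{G}_{yz}\mathbf{R}\,\mathbf{G}_{yz}^T\mathbf{H}_{zy}^T \\
&+ (t_{i+1} - t_i)\,\mathbf{H}_{zy}\mathbf{G}_y\mathbf{Q}_y\mathbf{G}_y^T\mathbf{H}_{zy}^T
\end{aligned} \tag{47}$$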

This equation shows how an initial covariance, $\mathbf{P}_{xx}(t_i)$, height uncertainty, measurement noise (characterized by $\mathbf{R}$), and process noise (characterized by $\mathbf{Q}_x$ and $\mathbf{Q}_y$) can be projected to the image plane at a later time, $t_{i+1}$, as expressed by $\mathbf{P}_{zz}(t_{i+1})$.
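The projection of the propagated covariances into pixel space is a four-term matrix expression, which is equivalent to applying the stacked Jacobian to the joint navigation/landmark covariance. The sketch below assumes a nine-element navigation error state and a three-element landmark error state; the random matrices stand in for quantities that would come from the earlier steps.

```python
import numpy as np

def pixel_covariance(H_zx, H_zy, P_xx, P_xy, P_yy):
    """2x2 pixel location covariance from navigation and landmark covariances."""
    return (H_zx @ P_xx @ H_zx.T
            + H_zx @ P_xy @ H_zy.T
            + H_zy @ P_xy.T @ H_zx.T
            + H_zy @ P_yy @ H_zy.T)

# Placeholder joint covariance (positive semidefinite by construction).
rng = np.random.default_rng(0)
A = rng.normal(size=(12, 12))
P_joint = A @ A.T
P_xx, P_xy, P_yy = P_joint[:9, :9], P_joint[:9, 9:], P_joint[9:, 9:]

# Placeholder projection Jacobians.
H_zx = rng.normal(size=(2, 9))
H_zy = rng.normal(size=(2, 3))
P_zz = pixel_covariance(H_zx, H_zy, P_xx, P_xy, P_yy)
```

Keeping the cross terms matters: the landmark estimate was formed from the same navigation state that is later used to project it, so the two error sources are correlated and partially cancel.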

In summary, given the pixel coordinates of a stationary ground landmark at time $t_i$, the predicted pixel coordinates of the same landmark at time $t_{i+1}$ can be described by a bivariate Gaussian probability density function with the mean given in Equation (41) and the covariance given in Equation (47). Thus, the correspondence search for the landmark can be constrained using a statistical confidence threshold. In the following section, the stochastic projection method is used to predict the location (and uncertainty) of a stationary landmark in an image.
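One way to turn the predicted mean and covariance into a search region is a Mahalanobis-distance gate. For a bivariate Gaussian, the squared Mahalanobis distance follows a chi-square distribution with two degrees of freedom, so the threshold that encloses a probability p is -2 ln(1 - p). The function names and the example covariance below are assumptions for illustration.

```python
import numpy as np

def confidence_threshold(probability):
    """Squared Mahalanobis distance enclosing `probability` of a 2-D Gaussian."""
    return -2.0 * np.log(1.0 - probability)

def in_search_region(pixel, z_pred, P_zz, probability):
    """True if `pixel` lies inside the confidence ellipse about `z_pred`."""
    d = np.asarray(pixel, dtype=float) - np.asarray(z_pred, dtype=float)
    d2 = d @ np.linalg.solve(P_zz, d)
    return d2 <= confidence_threshold(probability)

def search_ellipse(P_zz, probability):
    """Semi-axis lengths (pixels) and axis directions of the confidence ellipse."""
    eigvals, eigvecs = np.linalg.eigh(P_zz)
    return np.sqrt(confidence_threshold(probability) * eigvals), eigvecs

# Hypothetical prediction: mean at the image center with correlated errors.
z_pred = np.array([128.0, 128.0])
P_zz = np.array([[9.0, 3.0], [3.0, 4.0]])
threshold = confidence_threshold(0.95)
semi_axes, directions = search_ellipse(P_zz, 0.95)
```

A feature matcher would then evaluate only candidate pixels for which `in_search_region` is true, and the chosen probability trades search area against the chance of excluding the true match.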

VI.  Experiment 

The experiment validates the stochastic projection method using both simulated and real data collected from an airborne system. In this experiment, a Northrop T-38 “Talon” aircraft was equipped with a day-night monochrome digital video camera synchronized to a Honeywell H-764G Inertial Navigation System. The camera was mounted in the cockpit, pointing out the right wing. Flight data were collected in Fall of 2002 at Edwards Air Force Base, California.

Fig. 3. Northrop T-38 instrumented with synchronized digital video camera and inertial navigation system.

A  Monte  Carlo  simulation  of  the  test  flight  is  performed 
to  verify  the  stochastic  projection  model  with  respect  to  a 
statistically  significant  sampling  of  random  error  contributors. 
While this provides an indication of the adequacy of the system model, flight test data are used to verify the performance
of  the  algorithm  in  a  real-world  environment. 

A.  Monte  Carlo  Simulation 

The performance of the stochastic projection method presented in Section V was verified using a statistically representative ensemble of sample functions (300 per run). The data collection system used on the T-38 flights was simulated in software, based on a reference trajectory chosen to generate an interesting observation geometry. This constant-altitude circular flight path was constructed such that a fixed terrain patch remained in the camera field of view throughout the flight. The simulated aircraft speed was 150 meters per second, altitude was 2296 meters, and bank angle was 27 degrees, which described a circular flight path with a 4592 meter radius. The resulting slant range to the landmark was 5134 meters. The terrain elevation was simulated as a zero-mean random variable. Simulations were accomplished using a terrain elevation error standard deviation of 25 meters, representing a moderate accuracy terrain model. All simulations used a 10 second interval between the first and second image, which was equivalent to 18.7 degrees of arc in the horizontal plane. The simulation geometry is shown in Fig. 4.
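The quoted geometry can be cross-checked with a few lines of arithmetic. The sketch assumes the landmark sits at the center of the circular path at the reference elevation and that the turn is level and coordinated; both are assumptions made here to reproduce the stated figures, not details confirmed by the text.

```python
import math

speed = 150.0        # m/s
altitude = 2296.0    # m
radius = 4592.0      # m
interval = 10.0      # s between the first and second image
gravity = 9.80665    # m/s^2, standard gravity

# Slant range from the aircraft to a landmark at the center of the circle.
slant_range = math.hypot(altitude, radius)

# Arc swept in the horizontal plane during the interval.
arc_deg = math.degrees(speed * interval / radius)

# Bank angle for a level coordinated turn at this speed and radius.
bank_deg = math.degrees(math.atan(speed ** 2 / (gravity * radius)))
```

Under these assumptions the slant range and arc agree with the 5134 meters and 18.7 degrees quoted above, and the bank angle comes out near the stated 27 degrees.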

The results are shown in Fig. 5. In this figure, the predicted pixel location errors for each Monte Carlo sample function are represented by a “plus” symbol. The predicted 2-σ pixel location error bound is indicated by a line. Note the inclined elliptical nature of the 2-σ bound is a function of the trajectory and measurement geometry.
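When comparing Monte Carlo samples against a 2-σ ellipse, it helps to know how many samples should fall inside. For a bivariate Gaussian the 2-σ ellipse (squared Mahalanobis distance of 4) encloses 1 - exp(-2), or about 86.5 percent, of the probability, not the 95 percent familiar from the scalar case. The sketch below draws 300 samples from a hypothetical pixel error covariance and counts the fraction inside the bound; the covariance values and random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pixel error covariance with correlated axes (inclined ellipse).
P_zz = np.array([[4.0, 1.5], [1.5, 1.0]])

# 300 zero-mean pixel error samples, matching the ensemble size per run.
samples = rng.multivariate_normal(np.zeros(2), P_zz, size=300)

# Squared Mahalanobis distance of each sample.
d2 = np.einsum('ij,jk,ik->i', samples, np.linalg.inv(P_zz), samples)

fraction_inside = np.mean(d2 <= 4.0)       # empirical 2-sigma coverage
expected_fraction = 1.0 - np.exp(-2.0)     # theoretical 2-sigma coverage
```

If the empirical fraction from a simulation departs substantially from the theoretical value, the linearized covariance is either optimistic or conservative for that geometry.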

The same predicted pixel location errors are shown referenced to a 256x256 pixel image in Fig. 6. The stochastic constraint method yields a small correspondence search area that contains the landmark location with high probability. The stochastic constraint method is an improvement over the epipolar line search method, as it provides a smaller search area developed using a statistical model. This results in faster and more robust correspondence searches.

Fig. 4. Simulated flight path. In order to generate a good observation geometry, the circular orbit was chosen such that a fixed terrain patch remained in the camera field of view throughout the flight.

Fig. 5. Landmark pixel location error and predicted 2-σ bound for 25 meter terrain elevation uncertainty. Note the actual pixel location errors are similar to the predicted error bound. Note: X and Y axes have differing scales to show detail.

B.  Flight  Data 

In this section, the stochastic projection method is implemented using image and inertial flight data collected on the
T-38  aircraft.  The  aircraft  state  dynamics  are  a  function  of  the 
measurements  from  the  strapdown  inertial  sensors.  All  states 
are  estimated  in  the  Earth-centered  Earth-fixed  reference  frame 
previously  defined.  The  error  equations  were  developed  based 
on  [16],  [17].  For  this  example,  a  three  image  sequence  from  a 
right  turning  profile  is  shown  in  Fig.  7.  The  results  of  the  above 
method  for  predicting  the  future  target  location  and  uncertainty 
are  shown  in  Fig.  8. 

The target selected was the west corner of a building shown in Fig. 7. The estimated target location and 2-σ uncertainty bound shown in Fig. 8 form a predicted uncertainty ellipse after one second and seven seconds of flight. Note the uncertainty ellipse increases with flight time, as expected. In each case, incorporating camera motion information can constrain the correspondence search space significantly. Note the true landmark location remains consistent with the predicted 2-σ uncertainty ellipse in the presence of real measurement noise and terrain model errors.

Fig. 6. Landmark pixel location error and predicted 2-σ bound for 25 meter terrain elevation uncertainty referenced to a 256x256 pixel image. Note the stochastic constraint can limit the correspondence search area significantly compared to a search near the epipolar line.

Fig. 7. Three image sequence of an industrial area recorded during a T-38 flight, with a sample stationary ground landmark identified. Image (b) was taken 1 second after image (a). Image (c) was taken 7 seconds after image (a). The aircraft is in a right turn approximately 3.8 kilometers from the landmark.

Fig. 8. Predicted landmark location uncertainty using stochastic projection method (panels: Base Image, Image 1, Image 2, each at 2X zoom). The landmark selected was the west corner of a building in the base image (a), represented by the crosshair. Using the stochastic projection method, the landmark mean and 2-σ uncertainty are projected into two subsequent images to demonstrate the concept. The estimated landmark location and predicted 2-σ uncertainty for image (b) show an ellipsoidal uncertainty after one second of flight. Image (c) shows a further increase in the uncertainty after seven seconds of flight. In each subsequent image, constraining the correspondence search for the landmark to the ellipsoidal region reduces the required search area and would eliminate false matches with other features with a similar appearance (e.g., other building corners).

VII. Conclusions

In this paper, a stochastic projection method that incorporates the statistics of navigation dynamics and target motion models is developed to project the estimated pixel location and uncertainty of a landmark between two images. The theory is statistically rigorous, and results derived from simulations and actual flight data validate the accuracy of the approach for a number of realistic scenarios.